Abstract

In this paper, we consider the problem of sparse recovery from nonlinear measurements, which has
applications in state estimation and bad data detection for power networks. An iterative mixed $\ell_1$ and $\ell_2$
convex programming is used to estimate the true state by locally linearizing the nonlinear measurements.
When the measurements are linear, through using the almost Euclidean property for a linear subspace,
we derive a new performance bound for the state estimation error under sparse bad data and additive
observation noise. When the measurements are nonlinear, we give conditions under which the solution of
the iterative algorithm converges to the true state even though the locally linearized measurements may
not be the actual nonlinear measurements. We also numerically evaluate an iterative convex programming
approach to perform bad data detection in nonlinear electrical power network problems. As a byproduct,
in this paper we provide sharp bounds on the almost Euclidean property of a linear subspace, using the
"escape-through-a-mesh" theorem from geometric functional analysis.

I. Introduction 

In this paper, inspired by state estimation for nonlinear electrical power networks under bad data and 
additive noise, we study the problem of sparse recovery from nonlinear measurements. The static state of 
an electric power network can be described by the vector of bus voltage magnitudes and angles. However, 
in smart grid power networks, the measurement of these quantities can be corrupted due to errors in the 
sensors, communication errors in transmitting the measurement results, and adversarial compromises of 
the meters. In these settings, the observed measurements contain abnormally large measurement errors,
called bad data, in addition to the usual additive observation noise. So the state estimation of power
networks needs to detect, identify, and eliminate these large measurement errors lEl, [14], [3]. To make
the problem more challenging, the measurements in power networks are generally nonlinear functions of
the states. This motivates us to study the general problem of state estimation from nonlinear measurements
and bad data.

Part of this paper was presented at the 50th IEEE Conference on Decision and Control, 2011 [18].

In general, suppose that we make $n$ measurements to estimate the state $x$ described by an
$m$-dimensional ($m < n$) real-valued vector. These measurements can be written as an $n$-dimensional
vector $y$, which is related to the state vector through the measurement equation

$$y = h(x) + v + e, \tag{I.1}$$

where $h(x)$ is a set of $n$ general functions, which may be linear or nonlinear, $v$ is the vector of
additive measurement noise, and $e$ is the vector of bad data imposed on the measurements. In this paper,
we assume that $v$ is an $n$-dimensional vector with i.i.d. zero-mean Gaussian elements of variance $\sigma^2$.
We also assume that $e$ is a vector with only $k$ nonzero entries, and the nonzero entries can take arbitrary
real values, reflecting the nature of bad data.

When there are no bad data present, it is well known that the Least Squares (LS) method can be used
to suppress the effect of observation noise on state estimation. In the LS method, we try to find a vector
$x$ minimizing

$$\|y - h(x)\|_2. \tag{I.2}$$

However, the LS method generally only works well when there are no bad data e corrupting the 
observation y. If the magnitudes of bad data are large, the estimation result can be very far from the 
true state. So bad data detection to eliminate abnormal measurements is needed when there are bad data 
present in the measurement results. 
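
The sensitivity of LS to bad data can be illustrated with a minimal sketch. It assumes a hypothetical special case that is not taken from the paper: a scalar state measured $n$ times directly, so that $h(x) = (x, \ldots, x)$. In that case the LS estimate is the sample mean, while the $\ell_1$ (least absolute deviation) estimate is the sample median. The function names and the numerical values are illustrative assumptions.

```python
import statistics

def ls_estimate(y):
    """LS estimate for h(x) = (x, ..., x): the mean minimizes sum (y_i - x)^2."""
    return sum(y) / len(y)

def l1_estimate(y):
    """l1 estimate for the same model: a median minimizes sum |y_i - x|."""
    return statistics.median(y)

# Hypothetical data: true state 1.0, small noise v, one bad datum e on entry 3.
x_true = 1.0
noise = [0.01, -0.02, 0.00, 0.015, -0.005, 0.01, -0.01]
bad = [0.0, 0.0, 0.0, 50.0, 0.0, 0.0, 0.0]
y = [x_true + v + e for v, e in zip(noise, bad)]

ls_err = abs(ls_estimate(y) - x_true)
l1_err = abs(l1_estimate(y) - x_true)
print(f"LS error: {ls_err:.3f}, l1 error: {l1_err:.3f}")
```

In this toy setting the LS error grows in proportion to the magnitude of the bad datum divided by $n$, whereas the median stays within the noise level as long as fewer than half of the measurements are corrupted.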

Since the probability of large measurement errors occurring is very small, it is reasonable to assume
that bad data are only present in a small fraction of the $n$ available meter measurement results. So
bad data detection in power networks can be viewed as a sparse error detection problem, which shares
a similar mathematical structure with the sparse recovery problem in compressive sensing [Sl, 0-. However,
this problem in power networks is very different from the ordinary sparse error detection problem [4]. In
fact, $h(x)$ in (I.1) is a nonlinear mapping instead of a linear mapping as in [4]. It is the goal of this
paper to provide a sparse recovery algorithm and performance analysis for sparse recovery from nonlinear
measurements with applications in bad data detection for electrical power networks.

We first consider the simplified problem when $h(x)$ is linear, which serves as a basis for solving and
analyzing the sparse recovery problem with nonlinear measurements. For this sparse recovery problem
with linear measurements, a mixed least $\ell_1$ norm and least squares convex programming is used to
simultaneously detect bad data and subtract additive noise from the observations. In our theoretical
analysis of the decoding performance, we assume $h(x)$ is a linear transformation $Hx$, where $H$ is an
$n \times m$ matrix with i.i.d. standard zero-mean Gaussian entries. Through using the almost Euclidean property
for the linear subspace generated by $H$, we derive a new performance bound for the state estimation error
under sparse bad data and additive observation noise. In our analysis, using the "escape-through-a-mesh"
theorem from geometric functional analysis [8], we are able to significantly improve on the bounds for the
almost Euclidean property of a linear subspace, which may be interesting in a more general mathematical
setting. Compared with the earlier analysis of the same optimization problem in [4], the analysis in this paper
is new in using the almost Euclidean property rather than the restricted isometry conditions used in [4], and
we are able to give explicit bounds on the error performance, which are generally sharper than the result
in [4] in terms of recoverable sparsity.

Generalizing the algorithm and results from linear measurements, we propose an iterative convex
programming approach to perform joint noise reduction and bad data detection from nonlinear measurements.
We establish conditions under which the iterative algorithm converges to the true state in the presence of
bad data even when the measurements are nonlinear. Our iterative convex programming based algorithm is
shown to work well in this nonlinear setting by numerical examples. Compared with [12], which proposed
to apply $\ell_1$ minimization in bad data detection in power networks, our approach offers a better decoding
error performance when both bad data and additive observation noise are present. [10], [11] considered
state estimation under malicious data attacks, and formulated state estimation under malicious attacks
as a hypothesis testing problem by assuming a prior probability distribution on the state $x$. In contrast,
our approach does not rely on any prior information on the signal $x$ itself, and the performance bounds
hold for an arbitrary state $x$. Compressive sensing with nonlinear measurements was studied in lH by
extending the restricted isometry condition. Our sparse recovery problem is different from the compressive
sensing problem considered in [H since our measurements are overcomplete and are designed to perform
sparse error corrections instead of compressive sensing. Our analysis also does not rely on extensions of
the restricted isometry condition.

The rest of this paper is organized as follows. In Section II we study joint bad data detection and
denoising for linear measurements, and derive the performance bound on the decoding error based on
the almost Euclidean property of linear subspaces. In Section III a sharp bound on the almost Euclidean
property is given through the "escape-through-a-mesh" theorem. In Section IV we present explicitly
computed bounds on the estimation error for linear measurements. In Section V we propose our iterative
convex programming algorithm to perform sparse recovery from nonlinear measurements, give theoretical
analysis on the performance guarantee of the iterative algorithm, and give an example to illustrate the
algorithm and performance bounds. In Section VI we present simulation results of our iterative algorithm
to show its performance in power networks. Section VII concludes this paper.

II. Bad Data Detection for Linear Systems 

In this section, we introduce a convex programming formulation to do bad data detection in linear
systems, and characterize its decoding error performance. In a linear system, the corresponding $n \times 1$
observation vector in (I.1) is $y = Hx + e + v$, where $x$ is an $m \times 1$ signal vector ($m < n$), $H$ is an
$n \times m$ matrix, $e$ is a sparse error vector with $k$ nonzero elements, and $v$ is a noise vector with $\|v\|_2 \le \epsilon$.
In what follows, we denote the part of any vector $w$ over any index set $K$ as $w_K$.

We solve the following optimization problem involving optimization variables $x$ and $z$, and we then
estimate the state $x$ to be $\hat{x}$, which is the optimizer value for $x$:

$$\min_{x,z} \|y - Hx - z\|_1 \quad \text{subject to} \quad \|z\|_2 \le \epsilon. \tag{II.1}$$

This optimization problem appeared in a slightly different form in [4] by restricting $z$ in the null space
of $H$. We are now ready to give a theorem which bounds the decoding error performance of (II.1), using
the almost Euclidean property [6], [9].

Theorem 2.1: Let $y$, $H$, $x$, $e$ and $v$ be specified as above. Suppose that the minimum nonzero singular
value of $H$ is $\sigma_{\min}$. Let $C$ be a real number larger than 1, and suppose that every vector $w$ in the range of
the matrix $H$ satisfies $C\|w_K\|_1 \le \|w_{\bar{K}}\|_1$ for any subset $K \subseteq \{1, 2, \ldots, n\}$ with cardinality $|K| \le k$,
where $k$ is an integer, and $\bar{K} = \{1, 2, \ldots, n\} \setminus K$. We also assume the subspace generated by $H$ satisfies
the almost Euclidean property for a constant $\alpha \le 1$, namely

$$\alpha\sqrt{n}\,\|w\|_2 \le \|w\|_1$$

holds true for every $w$ in the subspace generated by $H$.
Then the solution $\hat{x}$ to (II.1) satisfies

$$\|x - \hat{x}\|_2 \le \frac{2(C+1)}{\sigma_{\min}\,\alpha\,(C-1)}\,\epsilon. \tag{II.2}$$

Proof: Suppose that one optimal solution pair to (II.1) is $(\hat{x}, \hat{z})$. Since $\|\hat{z}\|_2 \le \epsilon$, we have
$\|\hat{z}\|_1 \le \sqrt{n}\,\|\hat{z}\|_2 \le \sqrt{n}\,\epsilon$.

Since $x$ and $z = v$ are feasible for (II.1) and $y = Hx + e + v$, then

$$\|y - H\hat{x} - \hat{z}\|_1 = \|H(x - \hat{x}) + e + v - \hat{z}\|_1 \le \|H(x - x) + e + v - v\|_1 = \|e\|_1.$$

Applying the triangle inequality to $\|H(x - \hat{x}) + e + v - \hat{z}\|_1$, we further obtain

$$\|H(x - \hat{x}) + e\|_1 - \|v\|_1 - \|\hat{z}\|_1 \le \|e\|_1.$$

Denoting $H(x - \hat{x})$ as $w$, because $e$ is supported on a set $K$ with cardinality $|K| \le k$, by the triangle
inequality for the $\ell_1$ norm again,

$$\|e\|_1 - \|w_K\|_1 + \|w_{\bar{K}}\|_1 - \|v\|_1 - \|\hat{z}\|_1 \le \|e\|_1.$$

So we have

$$-\|w_K\|_1 + \|w_{\bar{K}}\|_1 \le \|\hat{z}\|_1 + \|v\|_1 \le 2\sqrt{n}\,\epsilon. \tag{II.3}$$

With $C\|w_K\|_1 \le \|w_{\bar{K}}\|_1$, we know

$$\frac{C-1}{C+1}\,\|w\|_1 \le -\|w_K\|_1 + \|w_{\bar{K}}\|_1.$$

Combining this with (II.3), we obtain

$$\frac{C-1}{C+1}\,\|w\|_1 \le 2\sqrt{n}\,\epsilon.$$

By the almost Euclidean property $\alpha\sqrt{n}\,\|w\|_2 \le \|w\|_1$, it follows:

$$\|w\|_2 \le \frac{2(C+1)}{\alpha(C-1)}\,\epsilon. \tag{II.4}$$

By the definition of singular values,

$$\sigma_{\min}\|x - \hat{x}\|_2 \le \|H(x - \hat{x})\|_2 = \|w\|_2, \tag{II.5}$$

so combining (II.4), we get

$$\|x - \hat{x}\|_2 \le \frac{2(C+1)}{\sigma_{\min}\,\alpha\,(C-1)}\,\epsilon.$$

■

Note that when there are no sparse errors present, the decoding error bound using the standard LS
method satisfies $\|x - \hat{x}\|_2 \le \frac{1}{\sigma_{\min}}\,\epsilon$ [4]. Theorem 2.1 shows that the decoding error bound of (II.1) is
oblivious to the amplitudes of these bad data. This phenomenon was also observed in [4] by using the
restricted isometry condition for compressive sensing.

We remark that, for given $y$ and $\epsilon$, by strong Lagrange duality theory, the solution $\hat{x}$ to (II.1) corresponds
to the solution for $x$ in the following problem (II.6) for some Lagrange dual variable $\lambda \ge 0$:

$$\min_{x,z} \|y - Hx - z\|_1 + \lambda\|z\|_2. \tag{II.6}$$

In fact, when $\lambda \to \infty$, the optimizer $\|z\|_2 \to 0$, and (II.6) approaches

$$\min_{x} \|y - Hx\|_1,$$

and when $\lambda \to 0$, the optimizer $z \to y - Hx$, and (II.6) approaches

$$\min_{x} \|y - Hx\|_2.$$

Thus, (II.6) can be viewed as a weighted version of $\ell_1$ minimization and $\ell_2$ minimization (or equivalently
the LS method). We will later use numerical experiments to show that in order to recover a sparse vector
from measurements with both noise and errors, this weighted version outperforms both $\ell_1$ minimization
and the LS method.

In the next two sections, we aim at explicitly computing $\frac{2(C+1)}{\sigma_{\min}\alpha(C-1)} \times \sqrt{n}$ appearing in the error bound
(II.2), which is subsequently denoted as $\varpi$ in this paper. The appearance of the $\sqrt{n}$ factor is to compensate
for the energy scaling of large random matrices and its meaning will be clear in later context. We first
compute explicitly the almost Euclidean property constant $\alpha$, and then use the almost Euclidean property
to get a direct estimate of the constant $C$ in the error bound (II.2).

III. Bounding the Almost Euclidean Property 

In this section, we would like to give a quantitative bound on the almost Euclidean property constant
$\alpha$ such that with high probability (with respect to the measure for the subspace generated by random $H$),
$\alpha\sqrt{n}\,\|w\|_2 \le \|w\|_1$ holds for every vector $w$ from the subspace generated by $H$. Here we assume that
each element of $H$ is generated from the standard Gaussian distribution $N(0, 1)$. Hence the subspace
generated by $H$ is a uniformly distributed $m$-dimensional subspace.

To ensure that the subspace generated from $H$ satisfies the almost Euclidean property with $\alpha > 0$, we
must have the event that the subspace generated by $H$ does not intersect the set
$\{w \in S^{n-1} \mid \|w\|_1 < \alpha\sqrt{n}\,\|w\|_2\}$, where $S^{n-1}$ is the unit Euclidean sphere in $R^n$. To evaluate the
probability that this event happens, we will need the following "escape-through-a-mesh" theorem.

Theorem 3.1: [8] Let $S$ be a subset of the unit Euclidean sphere $S^{n-1}$ in $R^n$. Let $Y$ be a random
$m$-dimensional subspace of $R^n$, distributed uniformly in the Grassmannian with respect to the Haar measure.
Let us further take $w(S) = E(\sup_{w \in S}(h^T w))$, where $h$ is a random column vector in $R^n$ with i.i.d.
$N(0, 1)$ components. Assume that $w(S) < \left(\sqrt{n-m} - \frac{1}{2\sqrt{n-m}}\right)$. Then

$$P(Y \cap S = \emptyset) > 1 - 3.5\,e^{-\frac{\left(\sqrt{n-m} - \frac{1}{2\sqrt{n-m}} - w(S)\right)^2}{18}}.$$

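From Theorem 3.1, we can use the following programming to get an estimate of the upper bound
of $w(h, S) = \sup_{w \in S}(h^T w)$. Because the set $\{w \in S^{n-1} \mid \|w\|_1 \le \alpha\sqrt{n}\,\|w\|_2\}$ is symmetric, without
loss of generality, we assume that the elements of $h$ follow i.i.d. half-normal distributions, namely the
distribution for the absolute value of a standard zero-mean Gaussian random variable. With $h_i$ denoting
the $i$-th element of $h$, $\sup_{w \in S}(h^T w)$ is equivalent to

$$\max \sum_{i=1}^{n} h_i y_i \tag{III.1}$$

$$\text{subject to} \quad y_i \ge 0, \quad 1 \le i \le n, \tag{III.2}$$

$$\sum_{i=1}^{n} y_i \le \alpha\sqrt{n}, \tag{III.3}$$

$$\sum_{i=1}^{n} y_i^2 = 1, \tag{III.4}$$

where $y_i$ denotes the $i$-th entry of the optimization variable (written $w_i$ in the Lagrangian that follows).

Following the method from ifTTl, we use Lagrange duality to find an upper bound for the objective
function of (III.1):

$$\min_{u_1 \ge 0,\, u_2 \ge 0,\, \lambda \ge 0} \; \max_{w} \; h^T w - u_1\Big(\sum_{i=1}^{n} w_i^2 - 1\Big) - u_2\Big(\sum_{i=1}^{n} w_i - \alpha\sqrt{n}\Big) + \sum_{i=1}^{n} \lambda_i w_i, \tag{III.5}$$

where $\lambda$ is a vector $(\lambda_1, \lambda_2, \ldots, \lambda_n)$.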

First, we maximize (III.5) over $w_i$, $i = 1, 2, \ldots, n$, for fixed $u_1$, $u_2$ and $\lambda$. By setting the derivatives to
zero, the maximizing $w_i$ is given by

$$w_i = \frac{h_i + \lambda_i - u_2}{2u_1}, \quad 1 \le i \le n.$$

Plugging this back into the objective function in (III.5), we get

$$\frac{\sum_{i=1}^{n} (-u_2 + \lambda_i + h_i)^2}{4u_1} + u_1 + \alpha\sqrt{n}\,u_2. \tag{III.6}$$

Next, we minimize (III.6) over $u_1 \ge 0$. It is not hard to see that the minimizing $u_1$ is

$$u_1^* = \frac{\sqrt{\sum_{i=1}^{n} (-u_2 + \lambda_i + h_i)^2}}{2},$$

and the corresponding minimized value is

$$\sqrt{\sum_{i=1}^{n} (-u_2 + \lambda_i + h_i)^2} + \alpha\sqrt{n}\,u_2. \tag{III.7}$$

Then, we minimize (III.7) over $\lambda \ge 0$. Given $h$ and $u_2 \ge 0$, it is easy to see that the minimizing $\lambda$ is

$$\lambda_i = \begin{cases} u_2 - h_i & \text{if } h_i \le u_2; \\ 0 & \text{otherwise,} \end{cases}$$

so that only the terms with $h_i > u_2$ remain, and the corresponding minimized value is

$$\sqrt{\sum_{1 \le i \le n:\, h_i > u_2} (u_2 - h_i)^2} + \alpha\sqrt{n}\,u_2. \tag{III.8}$$

Now if we take any $u_2 \ge 0$, (III.8) serves as an upper bound for (III.5), and thus also an upper bound
for $\sup_{w \in S}(h^T w)$. Since $\sqrt{\cdot}$ is a concave function, by Jensen's inequality, we have for any given $u_2 \ge 0$,

$$E\Big\{\sup_{w \in S}(h^T w)\Big\} \le \sqrt{E\Big\{\sum_{1 \le i \le n:\, h_i > u_2} (u_2 - h_i)^2\Big\}} + \alpha\sqrt{n}\,u_2. \tag{III.9}$$

Since $h$ has i.i.d. half-normal components, the right-hand side of (III.9) equals

$$\left(\sqrt{(u_2^2 + 1)\,\mathrm{erfc}(u_2/\sqrt{2}) - \sqrt{2/\pi}\;u_2\,e^{-u_2^2/2}} + \alpha u_2\right)\sqrt{n}, \tag{III.10}$$

where erfc is the complementary error function.

Fig. 1: $\alpha^*$ over $m/n$

One can check that (III.10) is convex in $u_2$. Given $\alpha$, we minimize (III.10) over $u_2 \ge 0$ and let $g(\alpha)\sqrt{n}$
denote the minimum value. Then from (III.9) and (III.10) we know

$$w(S) = E\Big(\sup_{w \in S}(h^T w)\Big) \le g(\alpha)\sqrt{n}. \tag{III.11}$$

Given $\delta = \frac{m}{n}$, we pick the largest $\alpha^*$ such that $g(\alpha^*) < \sqrt{1 - \delta}$. Then as $n$ goes to infinity, it holds that

$$w(S) \le g(\alpha^*)\sqrt{n} < \left(\sqrt{n-m} - \frac{1}{2\sqrt{n-m}}\right). \tag{III.12}$$

Then from Theorem 3.1, with high probability $\|w\|_1 \ge \alpha^*\sqrt{n}\,\|w\|_2$ holds for every vector $w$ in the
subspace generated by $H$. We numerically calculate how $\alpha^*$ changes over $\delta$ and plot the curve in Fig. 1.
For example, when $\delta = 0.5$, $\alpha^* = 0.332$, thus $\|w\|_1 \ge 0.332\sqrt{n}\,\|w\|_2$ for all $w$ in the subspace
generated by $H$.
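
The numerical computation of $\alpha^*$ described above can be sketched as follows. This is a minimal illustration, not the authors' code: it evaluates the per-$\sqrt{n}$ expression of the Jensen bound with erfc, minimizes over $u_2$ by a plain grid search (the grid range and resolution are assumptions), and finds the largest $\alpha$ with $g(\alpha) \le \sqrt{1-\delta}$ by bisection. The function names are hypothetical.

```python
import math

def width_bound(u2, alpha):
    """Per-sqrt(n) value of the erfc bound for a given u2 >= 0 and alpha."""
    inner = (u2 * u2 + 1.0) * math.erfc(u2 / math.sqrt(2.0)) \
        - math.sqrt(2.0 / math.pi) * u2 * math.exp(-u2 * u2 / 2.0)
    # Clamp tiny negative values caused by floating-point cancellation.
    return math.sqrt(max(inner, 0.0)) + alpha * u2

def g(alpha, u_max=8.0, steps=8000):
    """Approximate g(alpha), the minimum over u2 >= 0, by a grid search."""
    return min(width_bound(i * u_max / steps, alpha) for i in range(steps + 1))

def alpha_star(delta, iters=40):
    """Largest alpha in [0, 1] with g(alpha) <= sqrt(1 - delta), by bisection.

    Uses that g is nondecreasing in alpha, because the bound is
    nondecreasing in alpha for every fixed u2 >= 0.
    """
    target = math.sqrt(1.0 - delta)
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo

print(round(alpha_star(0.5), 3))
```

For $\delta = 0.5$ this should return a value close to the $0.332$ quoted in the text, up to grid resolution and rounding.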

Note that when $\frac{m}{n} = \frac{1}{2}$, we get $\alpha = 0.332$. That is much larger than the known $\alpha$ used in [19], which is
approximately 0.07 (see Equation (12) in [19]). When applied to the sparse recovery problem considered
in [19], we will be able to recover any vector with no more than $0.0289n = 0.0578m$ nonzero elements,
which is 20 times more than the bound in [19].

IV. Evaluating the Robust Error Correction Bound 

If the elements in the measurement matrix $H$ are i.i.d. unit real Gaussian random variables
$N(0, 1)$, following upon the work of Marchenko and Pastur [13], Geman n and Silverstein [16] proved
that for $m/n = \delta$, the smallest nonzero singular value satisfies

$$\frac{1}{\sqrt{n}}\,\sigma_{\min} \to 1 - \sqrt{\delta}$$

almost surely as $n \to \infty$.

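Now that we have explicitly bounded $\alpha$ and $\sigma_{\min}$, we proceed to characterize $C$. It turns
out that our earlier result on the almost Euclidean property can be used to compute $C$.

Lemma 4.1: Suppose an $n$-dimensional vector $w$ satisfies $\|w\|_1 \ge \alpha\sqrt{n}\,\|w\|_2$, and for some set
$K \subseteq \{1, 2, \ldots, n\}$ with cardinality $|K| = k < n$, $\frac{\|w_K\|_1}{\|w\|_1} = \beta$. Then $\beta$ satisfies

$$\frac{\beta^2}{k} + \frac{(1-\beta)^2}{n-k} \le \frac{1}{\alpha^2 n}.$$

Proof: Without loss of generality, we let $\|w\|_1 = 1$. Then by the Cauchy-Schwarz inequality,

$$\|w\|_2^2 = \|w_K\|_2^2 + \|w_{\bar{K}}\|_2^2 \ge \left(\frac{\|w_K\|_1}{\sqrt{k}}\right)^2 + \left(\frac{\|w_{\bar{K}}\|_1}{\sqrt{n-k}}\right)^2 = \frac{\beta^2}{k} + \frac{(1-\beta)^2}{n-k}.$$

At the same time, by the almost Euclidean property,

$$\alpha^2 n\,\|w\|_2^2 \le \|w\|_1^2 = 1,$$

so we must have

$$\frac{\beta^2}{k} + \frac{(1-\beta)^2}{n-k} \le \frac{1}{\alpha^2 n}.$$

■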

Corollary 4.2: If a nonzero $n$-dimensional vector $w$ satisfies $\|w\|_1 \ge \alpha\sqrt{n}\,\|w\|_2$, and if for some set
$K \subseteq \{1, 2, \ldots, n\}$ with cardinality $|K| = k < n$, $C\|w_K\|_1 = \|w_{\bar{K}}\|_1$ for some number $C \ge 1$, then

$$\frac{k}{n} \ge \frac{(B + 1 - C^2) - \sqrt{(B + 1 - C^2)^2 - 4B}}{2B}, \tag{IV.1}$$

where $B = \frac{(C+1)^2}{\alpha^2}$.

Proof: If $C\|w_K\|_1 = \|w_{\bar{K}}\|_1$, we have

$$\frac{\|w_K\|_1}{\|w\|_1} = \frac{1}{C+1}.$$

So by Lemma 4.1, $\beta = \frac{1}{C+1}$ satisfies

$$\frac{\beta^2}{k} + \frac{(1-\beta)^2}{n-k} \le \frac{1}{\alpha^2 n}.$$

This is equivalent to

$$\frac{1}{\frac{k}{n}} + \frac{C^2}{1 - \frac{k}{n}} \le \frac{(C+1)^2}{\alpha^2}.$$

Solving this inequality for $\frac{k}{n}$, we get (IV.1). ■

Fig. 2: $\varpi$ versus $\frac{k}{n}$
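
The last step of this proof amounts to solving a quadratic inequality in $t = k/n$: multiplying $1/t + C^2/(1-t) \le B$ by $t(1-t)$ gives $Bt^2 - (B + 1 - C^2)t + 1 \le 0$, whose smaller root is the threshold. A minimal sketch of this computation follows; the function name and the sample values $C = 2$, $\alpha = 0.332$ are illustrative assumptions.

```python
import math

def min_sparsity_ratio(C, alpha):
    """Smaller root of B t^2 - (B + 1 - C^2) t + 1 = 0, B = (C + 1)^2 / alpha^2.

    Under the corollary's assumptions, a vector with
    C * ||w_K||_1 = ||w_Kbar||_1 can only exist if k/n is at least this value.
    """
    B = (C + 1.0) ** 2 / alpha ** 2
    b = B + 1.0 - C * C
    disc = b * b - 4.0 * B
    if disc < 0:
        raise ValueError("no real root: the inequality holds for no k/n")
    return (b - math.sqrt(disc)) / (2.0 * B)

t = min_sparsity_ratio(C=2.0, alpha=0.332)
print(t)
```

At the returned value the inequality from Lemma 4.1 holds with equality, which gives a quick sanity check of the algebra.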

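So for a sparsity ratio $\frac{k}{n}$, this corollary can be used to find a lower bound on $C$ satisfying $\frac{\|w_{\bar{K}}\|_1}{\|w_K\|_1} = C$.
Combining these results on computing $\sigma_{\min}$, $\alpha$ and $C$, we can then compute the bound $\frac{2(C+1)}{\sigma_{\min}\alpha(C-1)} \times \sqrt{n} = \varpi$
in Theorem 2.1. For example, when $\delta = \frac{m}{n} = \frac{1}{2}$, we plot the bound $\varpi$ as a function of $\frac{k}{n}$ in Fig. 2.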
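
As a rough illustration of how the pieces combine, the sketch below evaluates the scaled constant for given $C$, $\alpha$ and $\delta$. It assumes that $\sigma_{\min}/\sqrt{n}$ is replaced by its almost-sure limit $1 - \sqrt{\delta}$, so that $n$ cancels; the function name and the sample inputs are hypothetical and not values reported in the paper.

```python
import math

def varpi(C, alpha, delta):
    """Asymptotic value of 2(C+1) * sqrt(n) / (sigma_min * alpha * (C-1)).

    Assumes sigma_min / sqrt(n) equals its limit 1 - sqrt(delta) for an
    n x m i.i.d. N(0, 1) matrix with m/n = delta.
    """
    if not (C > 1.0 and 0.0 < alpha <= 1.0 and 0.0 < delta < 1.0):
        raise ValueError("need C > 1, 0 < alpha <= 1 and 0 < delta < 1")
    sigma_min_scaled = 1.0 - math.sqrt(delta)
    return 2.0 * (C + 1.0) / (sigma_min_scaled * alpha * (C - 1.0))

print(varpi(C=2.0, alpha=0.332, delta=0.5))
```

The constant decreases as $C$ grows, which is consistent with sparser bad data (larger admissible $C$) giving a tighter error bound.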

V. Sparse Error Correction from Nonlinear Measurements 

In applications, measurement outcomes can be nonlinear functions of system states. Let us denote the
$i$-th measurement by $h_i(x)$, where $1 \le i \le n$ and $h_i(x)$ can be a nonlinear function of $x$. In this section,
we study the theoretical performance guarantee of sparse recovery from nonlinear measurements and
give an iterative algorithm to do sparse recovery from nonlinear measurements, for which we provide
conditions under which the iterative algorithm converges to the true state.

In Subsection V-A we explore the conditions under which sparse recovery from nonlinear measurements
is theoretically possible. In Subsection V-B we describe our iterative algorithm to perform
sparse recovery from nonlinear measurements. In Subsection V-C we study the algorithm performance
guarantees when the measurements are with or without additive noise. In Subsection V-D we give an
example to illustrate our algorithm and analysis.

A. Theoretical Guarantee for Direct $\ell_0$ and $\ell_1$-Minimization

We first give a general condition which guarantees recovering correctly the state $x$ from the corrupted
observation $y$ without considering the computational cost.

Theorem 5.1: Let $y$, $h(\cdot)$, $x$, $H$, and $e$ be specified as above; and $y = h(x) + e$. A state $x$ can be
recovered correctly from any error $e$ with $\|e\|_0 \le k$ from solving the optimization

$$\min_{x} \|y - h(x)\|_0, \tag{V.1}$$

if and only if for any $x^* \ne x$, $\|h(x) - h(x^*)\|_0 \ge 2k + 1$.

Proof: We first prove the sufficiency part, namely if for any $x^* \ne x$, $\|h(x) - h(x^*)\|_0 \ge 2k + 1$, we
can always correctly recover $x$ from $y$ corrupted with any error $e$ with $\|e\|_0 \le k$. Suppose that instead
a solution to the optimization problem (V.1) is an $x^* \ne x$. Then

$$\|y - h(x^*)\|_0 = \|(h(x) + e) - h(x^*)\|_0 \ge \|h(x) - h(x^*)\|_0 - \|e\|_0 \ge (2k + 1) - k > \|e\|_0 = \|y - h(x)\|_0.$$

So $x^* \ne x$ cannot be a solution to (V.1), which is a contradiction.

For the necessary part, suppose that there exists an $x^* \ne x$ such that $\|h(x) - h(x^*)\|_0 \le 2k$. Let $I$ be
the index set where $h(x)$ and $h(x^*)$ differ, with size $|I| \le 2k$. Let $\gamma = h(x^*) - h(x)$. We pick $e$ such
that $e_i = \gamma_i$, $\forall i \in I'$, where $I' \subseteq I$ is an index set with cardinality $|I'| = k$; and $e_i$ to be 0 otherwise.
Then

$$\|y - h(x^*)\|_0 = \|h(x) - h(x^*) + e\|_0 \le k = \|e\|_0 = \|y - h(x)\|_0,$$

which means that $x$ is not guaranteed to be a solution to (V.1) and is certainly not a unique solution to (V.1). ■
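
The sufficiency direction of the $\ell_0$ condition can be checked by brute force on a toy instance. The sketch below is an illustration under assumptions that are not in the paper: the state is restricted to a small finite candidate set (the paper's state is real-valued), the map `h` is a hypothetical nonlinear function with $n = 5$ outputs, and the $\ell_0$ minimization is done by exhaustive search.

```python
from itertools import combinations

def h(x):
    """Hypothetical nonlinear measurement map with n = 5 outputs."""
    return (x, x + 1, 2 * x, x * x, 3 * x)

def l0_distance(a, b):
    """Number of entries in which two measurement vectors differ."""
    return sum(1 for ai, bi in zip(a, b) if ai != bi)

def l0_decode(y, candidates):
    """Exhaustive-search l0 minimization over a finite candidate set."""
    return min(candidates, key=lambda x: l0_distance(y, h(x)))

candidates = range(5)
# Distinct candidates differ in all 5 entries, so 5 >= 2k + 1 holds with k = 2.
min_dist = min(l0_distance(h(a), h(b)) for a, b in combinations(candidates, 2))

def corrupt(y, positions, value=100.5):
    """Overwrite the given positions with an arbitrary bad value."""
    return tuple(value if i in positions else yi for i, yi in enumerate(y))

all_recovered = all(
    l0_decode(corrupt(h(x), pos), candidates) == x
    for x in candidates
    for pos in combinations(range(5), 2)
)
print(min_dist, all_recovered)  # prints: 5 True
```

Every state is recovered from every pattern of two corrupted entries, as the condition $\|h(x) - h(x^*)\|_0 \ge 2k + 1$ predicts for $k = 2$.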
Theorem 5.2: Let $y$, $h(\cdot)$, $x$, $H$, and $e$ be specified as above; and $y = h(x) + e$. A state $x$ can
be recovered correctly from any error $e$ with $\|e\|_0 \le k$ from solving the optimization

$$\min_{x} \|y - h(x)\|_1, \tag{V.2}$$

if and only if for any $x^* \ne x$, $\|(h(x) - h(x^*))_K\|_1 < \|(h(x) - h(x^*))_{\bar{K}}\|_1$, where $K$ is the support of
the error vector $e$.

Proof: We first prove that if for any $x^* \ne x$, $\|(h(x) - h(x^*))_K\|_1 < \|(h(x) - h(x^*))_{\bar{K}}\|_1$, where $K$ is
the support of the error vector $e$, we can correctly recover the state $x$ from (V.2). Suppose that instead a
solution to the optimization problem (V.2) is an $x^* \ne x$. Then

$$\|y - h(x^*)\|_1 = \|(h(x) + e) - h(x^*)\|_1 = \|e_K - (h(x^*) - h(x))_K\|_1 + \|(h(x^*) - h(x))_{\bar{K}}\|_1$$

$$\ge \|e_K\|_1 - \|(h(x^*) - h(x))_K\|_1 + \|(h(x^*) - h(x))_{\bar{K}}\|_1 > \|e_K\|_1 = \|y - h(x)\|_1.$$

So $x^* \ne x$ cannot be a solution to (V.2), and this leads to a contradiction.

Now suppose that there exists an $x^* \ne x$ such that $\|(h(x) - h(x^*))_K\|_1 \ge \|(h(x) - h(x^*))_{\bar{K}}\|_1$, where
$K$ is the support of the error vector $e$. Then we can pick $e$ to be $(h(x^*) - h(x))_K$ over its support $K$
and to be 0 over $\bar{K}$. Then

$$\|y - h(x^*)\|_1 = \|h(x) - h(x^*) + e\|_1 = \|(h(x) - h(x^*))_{\bar{K}}\|_1 \le \|(h(x) - h(x^*))_K\|_1 = \|e\|_1 = \|y - h(x)\|_1,$$

which means that $x$ is not guaranteed to be a solution to (V.2) and is certainly not a unique solution to (V.2). ■

However, direct $\ell_0$ and $\ell_1$ minimization may be computationally costly because the $\ell_0$ norm and nonlinear
$h(\cdot)$ may lead to non-convex optimization problems. In the next subsection, we introduce our
computationally efficient iterative sparse recovery algorithm in the general setting when the additive noise $v$ is
present.

B. Iterative $\ell_1$-Minimization Algorithm

Let $y$, $h(\cdot)$, $x$, $H$, $e$ and $v$ be specified as above; and $y = h(x) + e + v$ with $\|v\|_2 \le \epsilon$. Now let
us consider the algorithm which recovers the state variables iteratively. Ideally, an estimate of the state
variables, $\hat{x}$, can be obtained by solving the following minimization problem,

$$\min_{x,z} \|y - h(x) - z\|_1 \quad \text{subject to} \quad \|z\|_2 \le \epsilon, \tag{V.3}$$

where $\hat{x}$ is the optimal solution $x$. Even though the $\ell_1$ norm is a convex function, the function $h(\cdot)$ may
make the objective function non-convex.

Since $h$ is nonlinear, we linearize the equations and apply an iterative procedure to obtain a solution.
We start with an initial state $x^0$. In the $k$-th ($k \ge 1$) iteration, let $\Delta y^k = y - h(x^{k-1})$, then we solve
the following convex optimization problem,

$$\min_{\Delta x, z} \|\Delta y^k - H^{\mathrm{local}} \Delta x - z\|_1 \quad \text{subject to} \quad \|z\|_2 \le \epsilon, \tag{V.4}$$

where $H^{\mathrm{local}}$ is the $n \times m$ Jacobian matrix of $h$ evaluated at the point $x^{k-1}$. Let $\Delta x^k$ denote the optimal
solution $\Delta x$ to (V.4), then the state estimation is updated by

$$x^k = x^{k-1} + \Delta x^k. \tag{V.5}$$

We repeat the process until $\Delta x^k$ is close enough to 0 or $k$ reaches a specified maximum value.

Note that when there is no additive noise, we can take $\epsilon = 0$ in this iterative algorithm. In that case,
the algorithm is exactly the same as the state estimation algorithm from [12].
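
The iteration can be sketched in a deliberately simplified setting. The code below assumes a scalar state ($m = 1$) and no additive noise ($\epsilon = 0$, hence $z = 0$), so that each linearized $\ell_1$ subproblem reduces to a weighted median and no convex solver is needed. The measurement functions $h_i(x) = a_i x + 0.1x^2$, the coefficients, and all function names are hypothetical choices for illustration; for general $m$ or $\epsilon > 0$ the subproblem would need an LP or second-order cone solver.

```python
A = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 1.1]   # hypothetical coefficients a_i
C_QUAD = 0.1                               # hypothetical shared curvature

def h(x):
    """Hypothetical nonlinear measurements h_i(x) = a_i * x + 0.1 * x^2."""
    return [a * x + C_QUAD * x * x for a in A]

def jacobian(x):
    """Local Jacobian (an n x 1 column) of h at x."""
    return [a + 2.0 * C_QUAD * x for a in A]

def weighted_median(values, weights):
    """A minimizer of sum_i weights[i] * |values[i] - d| over d."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    acc = 0.0
    for v, w in pairs:
        acc += w
        if acc >= half:
            return v
    return pairs[-1][0]

def iterative_l1(y, x0, max_iter=50, tol=1e-12):
    """Linearize, solve the l1 subproblem, update; scalar state, epsilon = 0."""
    x = x0
    for _ in range(max_iter):
        dy = [yi - hi for yi, hi in zip(y, h(x))]
        J = jacobian(x)
        # min_d sum |dy_i - J_i d| = min_d sum |J_i| * |dy_i / J_i - d|
        dx = weighted_median([r / j for r, j in zip(dy, J)],
                             [abs(j) for j in J])
        x += dx
        if abs(dx) < tol:
            break
    return x

x_true = 1.0
y = h(x_true)
y[2] += 25.0          # one bad datum, no additive noise
x_hat = iterative_l1(y, x0=0.0)
print(abs(x_hat - x_true))
```

In this toy example the single bad measurement never determines the weighted median, so the iterates contract toward the true state despite starting from $x^0 = 0$.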

C. Convergence Conditions for the Iterative Sparse Recovery Algorithm

In this subsection, we discuss the convergence of the proposed algorithm in Subsection V-B. First, we
give a necessary condition (Theorem 5.3) for recovering the true state when there is no additive noise,
and then give a sufficient condition (Theorem 5.4) for the iterative algorithm to converge to the true
state in the absence of additive noise. Secondly, we give the performance bounds (Theorem 5.5) for the
iterative sparse recovery algorithm when there is additive noise.

Theorem 5.3 (Necessary Condition for Recovering True State): Let $y$, $h(\cdot)$, $x$, $H$, and $e$ be specified
as above; and $y = h(x) + e$. The iterative algorithm converges to the true state $x$ only if for the Jacobian
matrix $H^{\mathrm{local}}$ at the point of $x$ and for any $x^* \ne 0$, $\|(H^{\mathrm{local}} x^*)_K\|_1 < \|(H^{\mathrm{local}} x^*)_{\bar{K}}\|_1$, where $K$ is the
support of the error vector $e$.

Proof: The proof follows from the proof for Theorem 5.2, with the linear function $g(\Delta x) = h(x) +
H^{\mathrm{local}} \Delta x$, where $H^{\mathrm{local}}$ is the Jacobian matrix at the true state $x$. ■

Theorem 5.3 shows that for nonlinear measurements, the local Jacobian matrix needs to satisfy the
same condition as the matrix for linear measurements. This assumes that the iterative algorithm starts
with the correct initial state. However, the iterative algorithm generally does not start at the true state $x$.
In the following theorem, we give a sufficient condition for the algorithm to converge to the true state
when there is no additive noise.

Theorem 5.4 (Guarantee without Additive Noise): Let $y$, $h(\cdot)$, $x$, $H$, and $e$ be specified as above; and
$y = h(x) + e$. Suppose that at every point $x$, the local Jacobian matrix $H$ is full rank and satisfies that for
every $z$ in the range of $H$, $C\|z_K\|_1 \le \|z_{\bar{K}}\|_1$, where $K$ is the support of the error vector $e$. Moreover,
for a fixed constant $\beta < 1$, we assume that

$$\frac{2(C+1)}{C-1} \cdot \frac{\sigma^1_{\max}(H^{\mathrm{true}} - H^{\mathrm{local}})}{\sigma^1_{\min}(H^{\mathrm{local}})} \le \beta \tag{V.6}$$

holds true for any two states $x_1$ and $x_2$, where $H^{\mathrm{local}}$ is the local Jacobian matrix at the point $x_1$,
$H^{\mathrm{true}}$ is a matrix such that $h(x_2) - h(x_1) = H^{\mathrm{true}}(x_2 - x_1)$, $\sigma^1_{\max}(A)$ is the induced $\ell_1$ matrix norm for $A$,
and $\sigma^1_{\min}(A)$ for a matrix $A$ is defined as $\sigma^1_{\min}(A) = \min\{\|Az\|_1 : \|z\|_1 = 1\}$.

Then any state $x$ can be recovered correctly from the observation $y$ by the iterative algorithm in
Subsection V-B, regardless of the initial starting state of the algorithm.
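Proof: We know that

$$y = H^{\mathrm{true}} \Delta x^* + h(x^k) + e, \tag{V.7}$$

where $H^{\mathrm{true}}$ is an $n \times m$ matrix and $\Delta x^* = x - x^k$, namely the estimation error at the $k$-th step.
At the $(k+1)$-th step, we are solving the following optimization problem

$$\min_{\Delta x} \|y - h(x^k) - H^{\mathrm{local}} \Delta x\|_1. \tag{V.8}$$

Plugging (V.7) into (V.8), this is equivalent to

$$\min_{\Delta x} \|H^{\mathrm{true}} \Delta x^* + e - H^{\mathrm{local}} \Delta x\|_1, \tag{V.9}$$

which we can further write as

$$\min_{\Delta x} \|H^{\mathrm{local}} \Delta x^* + (H^{\mathrm{true}} - H^{\mathrm{local}}) \Delta x^* + e - H^{\mathrm{local}} \Delta x\|_1. \tag{V.10}$$

We denote $(H^{\mathrm{true}} - H^{\mathrm{local}}) \Delta x^*$ as $w$, which is the measurement gap generated by using the local
Jacobian matrix $H^{\mathrm{local}}$ instead of $H^{\mathrm{true}}$. Suppose that the solution is $\Delta x = \Delta x^* - \mathrm{error}$. Since we
are looking for the solution which minimizes the objective $\ell_1$ norm, and $\Delta x = \Delta x^*$ is feasible for the
optimization problem (V.8), we have

$$\|H^{\mathrm{local}} \times \mathrm{error} + w + e\|_1 \le \|w + e\|_1. \tag{V.11}$$

By the triangle inequality, we have

$$\|e\|_1 + \frac{C-1}{C+1}\,\|H^{\mathrm{local}} \times \mathrm{error}\|_1 - \|w\|_1 \le \|e\|_1 + \|w\|_1. \tag{V.12}$$

So

$$\|H^{\mathrm{local}} \times \mathrm{error}\|_1 \le \frac{2(C+1)}{C-1}\,\|w\|_1. \tag{V.13}$$

Since $\mathrm{error} = \Delta x^* - \Delta x$, $(x - x^k) = \Delta x^*$, and $x - x^{k+1} = (x - x^k) - (x^{k+1} - x^k) = \Delta x^* - \Delta x$,
we have

$$\frac{\|x - x^{k+1}\|_1}{\|x - x^k\|_1} \le \frac{2(C+1)}{C-1} \cdot \frac{\sigma^1_{\max}(H^{\mathrm{true}} - H^{\mathrm{local}})}{\sigma^1_{\min}(H^{\mathrm{local}})}, \tag{V.14}$$

where $\sigma^1_{\max}(H^{\mathrm{true}} - H^{\mathrm{local}})$ and $\sigma^1_{\min}(H^{\mathrm{local}})$ are respectively the matrix quantities defined in the
statement of the theorem.
So as long as

$$\frac{2(C+1)}{C-1} \cdot \frac{\sigma^1_{\max}(H^{\mathrm{true}} - H^{\mathrm{local}})}{\sigma^1_{\min}(H^{\mathrm{local}})} \le \beta \tag{V.15}$$

for some constant $\beta < 1$, the algorithm converges to the true state $x$ and the estimation error eventually
decreases to 0. ■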
While the algorithm can converge to the true state when there is no additive noise, the following 
theorem gives the performance bound for the iterative sparse recovery algorithm when there is additive 
noise. 

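Theorem 5.5 (Guarantee with Additive Noise): Let $y$, $h(\cdot)$, $x$, $H$, $e$, and $v$ be specified as above; and
$y = h(x) + e + v$ with $\|v\|_2 \le \epsilon$. Suppose that at every point $x$, the local Jacobian matrix $H$ is full rank
and satisfies that for every $z$ in the range of $H$, $C\|z_K\|_1 \le \|z_{\bar{K}}\|_1$, where $K$ is the support of the error
vector $e$. Moreover, for a fixed constant $\beta < 1$, we assume that

$$\frac{2(C+1)\,\sigma^1_{\max}(H^{\mathrm{true}} - H^{\mathrm{local}})}{(C-1)\,\sigma^1_{\min}(H^{\mathrm{local}})} \le \beta \tag{V.16}$$

holds for any two states $x_1$ and $x_2$, where $H^{\mathrm{local}}$ is the local Jacobian matrix at the point $x_1$, $H^{\mathrm{true}}$ is a
matrix such that $h(x_2) - h(x_1) = H^{\mathrm{true}}(x_2 - x_1)$, $\sigma^1_{\max}(A)$ is the induced $\ell_1$ matrix norm for $A$, and
$\sigma^1_{\min}(A)$ for a matrix $A$ is defined as $\sigma^1_{\min}(A) = \min\{\|Az\|_1 : \|z\|_1 = 1\}$.

Then for any true state $x$, the estimation $x^{k+1} = x^k + \Delta x^{k+1}$, where $\Delta x^{k+1}$ is the solution to the
$(k+1)$-th iteration optimization

$$\min_{\Delta x^{k+1},\, z} \|\Delta y^{k+1} - H^{\mathrm{local}} \Delta x^{k+1} - z\|_1 \quad \text{subject to} \quad \|z\|_2 \le \epsilon, \tag{V.17}$$

satisfies

$$\|x - x^{k+1}\|_1 \le \frac{2(C+1)}{(C-1)\,\sigma^1_{\min}(H^{\mathrm{local}})} \times 2\sqrt{n}\,\epsilon + \frac{2(C+1)\,\sigma^1_{\max}(H^{\mathrm{true}} - H^{\mathrm{local}})}{(C-1)\,\sigma^1_{\min}(H^{\mathrm{local}})}\,\|x - x^k\|_1.$$

As $k \to \infty$, with $\frac{2(C+1)\,\sigma^1_{\max}(H^{\mathrm{true}} - H^{\mathrm{local}})}{(C-1)\,\sigma^1_{\min}(H^{\mathrm{local}})} \le \beta < 1$,

$$\|x - x^{k+1}\|_1 \le \frac{2(C+1)}{(1-\beta)(C-1)\,\sigma^1_{\min}(H^{\mathrm{local}})} \times 2\sqrt{n}\,\epsilon.$$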

Proof: The proof follows the same line of reasoning in proving Theorem 15.41 and Theorem 12.11 In 

fact, 

y = //*™^Ax* + h{-K^) + e + V, (V.18) 

where H^^"^"^ is an n x m matrix and Ax* = x — x'^, namely the estimation error at the A;-th step. 
Since at the {k + l)-th step, we are solving the following optimization problem 

mill ||Ay-i?'°'='''Ax-z||i, 

Ax,z 

subject to ||z||2 < e. (V.19) 
Plugging (IV. 181) into (IV.19I I. we are really solving 

min 11/?*™'^ Ax* + e + V - H^^^^^Ax - zlh, 

Ax,z 

subject to ||z||2 < e. (V.20) 
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Denoting (iJ*™^ _ jjiocai-^^^^* 

as w, which is the measurement gap generated by using the local 
Jacobian matrix H^°^"-^ instead of iJ*''"^, then (IV. 201) is equivalent to 

minAx,z ||i?'°'"''(Ax* - Ax) + w + e + v - z||i, 

subject to ||z||2 < e. (V.21) 

Suppose that the solution to (IV. 171 ) is Ax = Ax* — error. We are minimizing the objective li norm, 
and (Ax*, v) is a feasible solution with an objective function value ||w + e||i, so we have 

ll^g-Zoca/ ^ g^^^^ + w + e + V - z||i < ||vi^ + e||i. (V.22) 

By the triangle inequality and the property of $H^{local}$, using the same line of reasoning as in the proof
of Theorem 2.1, we have

\[ \|e\|_1 + \frac{C-1}{C+1}\|H^{local} \times \text{error}\|_1 - \|w\|_1 - \|v\|_1 - \|z\|_1 \le \|e\|_1 + \|w\|_1. \tag{V.23} \]

So

\[ \|H^{local} \times \text{error}\|_1 \le \frac{2(C+1)}{C-1}\left(\|w\|_1 + \|v\|_1 + \|z\|_1\right). \tag{V.24} \]

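Since $\|v\|_1$ and $\|z\|_1$ are each no bigger than $\sqrt{n}\epsilon$ (so their sum is at most $2\sqrt{n}\epsilon$), and since
$\|w\|_1 \le \sigma^1_{\max}(H^{true} - H^{local})\|x - x^k\|_1$ and $\|H^{local} \times \text{error}\|_1 \ge \sigma^1_{\min}(H^{local})\|x - x^{k+1}\|_1$,
using the same reasoning as in the proof of Theorem 5.4 we have

\[ \|x - x^{k+1}\|_1 \le \frac{2(C+1)}{(C-1)\sigma^1_{\min}(H^{local})} \times 2\sqrt{n}\epsilon + \frac{2(C+1)\sigma^1_{\max}(H^{true} - H^{local})}{(C-1)\sigma^1_{\min}(H^{local})} \|x - x^k\|_1, \]

where $\sigma^1_{\max}(H^{true} - H^{local})$ and $\sigma^1_{\min}(H^{local})$ are respectively the matrix quantities defined in the
statement of the theorem.
So as long as

\[ \frac{2(C+1)\sigma^1_{\max}(H^{true} - H^{local})}{(C-1)\sigma^1_{\min}(H^{local})} \le \beta \]

for some fixed constant $\beta < 1$, the error upper bound converges to $\frac{2(C+1)}{(1-\beta)(C-1)\sigma^1_{\min}(H^{local})} \times 2\sqrt{n}\epsilon$. ■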
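The convergence of the bound can be made explicit by unrolling the one-step recursion. As a sketch, write $a_k = \|x - x^k\|_1$ and $c = \frac{2(C+1)}{(C-1)\sigma^1_{\min}(H^{local})} \times 2\sqrt{n}\epsilon$ (both are shorthand introduced here for illustration, not notation from the theorem), and assume the contraction factor is at most $\beta < 1$ at every step:

```latex
\begin{align*}
a_{k+1} &\le c + \beta a_k \\
        &\le c(1 + \beta) + \beta^2 a_{k-1} \\
        &\le \cdots \\
        &\le c \sum_{j=0}^{k} \beta^j + \beta^{k+1} a_0 \\
        &\le \frac{c}{1-\beta} + \beta^{k+1} a_0 .
\end{align*}
```

Since $\beta < 1$, the term $\beta^{k+1} a_0$ vanishes as $k \to \infty$, which leaves the limiting bound $c/(1-\beta)$.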
D. An Example of Sparse Recovery from Nonlinear Measurements

Now we give an example of sparse recovery from nonlinear measurements. For simplicity, we assume that
the measurements are corrupted with sparse bad data but not with additive noise. $h(\cdot)$ is a 12-dimensional
vector-valued mapping of two variables $x$ and $y$, which is given in (V.5). We index the 12 measurements
sequentially from top to bottom as $1, 2, \ldots, 12$.



The following lemma shows that this set of nonlinear measurements is able to correct one corrupted
entry in the measurements.

Lemma 5.6: If $(x_1,y_1) \ne (x_2,y_2)$, then $\|h(x_1,y_1) - h(x_2,y_2)\|_0 \ge 3$, and so any state can be correctly
recovered when at most one error is present in the measurements.

\[ h(x,y) = \begin{pmatrix}
(x+y)\sin(x+y) \\
(x+y)\cos(x+y) \\
(x-y)\sin(x-y) \\
(x-y)\cos(x-y) \\
(x+y)\sin(x-y) \\
(x-y)\sin(x+y) \\
(x+y)\cos(x-y) \\
(x-y)\cos(x+y) \\
x\sin(x) \\
x\cos(x) \\
y\sin(y) \\
y\cos(y)
\end{pmatrix} \tag{V.5} \]

Proof: Suppose that $x_1 - y_1 = x_2 - y_2$ and $x_1 + y_1 \ne x_2 + y_2$. We further consider two cases. In the
first case, $x_1 - y_1 = x_2 - y_2 = 0$; then for index 7, for at least one of indices 1 and 2, and for at least one of
indices 9, 10, 11 and 12, $h(x_1,y_1)$ and $h(x_2,y_2)$ are different. In the second case, $x_1 - y_1 = x_2 - y_2 \ne 0$;
then for at least one of indices 1 and 2, $h(x_1,y_1)$ and $h(x_2,y_2)$ are different; for at least one of indices
5 and 7, $h(x_1,y_1)$ and $h(x_2,y_2)$ are different; and for at least one of indices 9, 10, 11 and 12, $h(x_1,y_1)$
and $h(x_2,y_2)$ are different.

Suppose that $x_1 + y_1 = x_2 + y_2$ and $x_1 - y_1 \ne x_2 - y_2$. By symmetry with the previous scenario
"$x_1 - y_1 = x_2 - y_2$ and $x_1 + y_1 \ne x_2 + y_2$", we have $\|h(x_1,y_1) - h(x_2,y_2)\|_0 \ge 3$.

Now we suppose that $x_1 - y_1 \ne x_2 - y_2$ and $x_1 + y_1 \ne x_2 + y_2$; then for at least one of indices 1
and 2, $h(x_1,y_1)$ and $h(x_2,y_2)$ are different; for at least one of indices 3 and 4, $h(x_1,y_1)$ and $h(x_2,y_2)$
are different; and for at least one of indices 9, 10, 11 and 12, $h(x_1,y_1)$ and $h(x_2,y_2)$ are different.
Summarizing all these scenarios, if $(x_1,y_1) \ne (x_2,y_2)$, then $\|h(x_1,y_1) - h(x_2,y_2)\|_0 \ge 3$. ■
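The case analysis in the proof can be spot-checked numerically. The following is a minimal sketch, assuming the twelve entries of $h$ are ordered as in the example; the test pairs are hypothetical values chosen here, one per case, and the tolerance `tol` is an illustrative choice. A numerical check of a few pairs is of course not a substitute for the proof.

```python
import math

def h(x, y):
    """The 12 measurement functions of the example, in index order 1..12."""
    s, d = x + y, x - y
    return [
        s * math.sin(s), s * math.cos(s),
        d * math.sin(d), d * math.cos(d),
        s * math.sin(d), d * math.sin(s),
        s * math.cos(d), d * math.cos(s),
        x * math.sin(x), x * math.cos(x),
        y * math.sin(y), y * math.cos(y),
    ]

def hamming_distance(p, q, tol=1e-12):
    """Number of measurement indices at which h(p) and h(q) differ."""
    return sum(abs(a - b) > tol for a, b in zip(h(*p), h(*q)))

# Hypothetical test pairs, one per case considered in the proof.
cases = {
    "x-y equal and zero": ((0.3, 0.3), (0.5, 0.5)),
    "x-y equal and nonzero": ((0.4, 0.1), (0.6, 0.3)),
    "x+y equal": ((0.4, 0.1), (0.2, 0.3)),
    "both differ": ((0.2, 0.45), (0.1, 0.2)),
}
distances = {name: hamming_distance(p, q) for name, (p, q) in cases.items()}
for name, dist in distances.items():
    print(f"{name}: {dist} entries differ")
```

Each pair should report at least three differing entries, which is what allows a single corrupted measurement to be corrected.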
So this system of nonlinear measurements can guarantee correcting one bad data entry. But can we
efficiently find the true state from bad data using the iterative sparse recovery algorithm in Subsection
V-B? To proceed, we first give the Jacobian matrix for $h(\cdot)$ in (V.6).

Suppose the true state is $(x, y) = (0.2, 0.45)$, and suppose there is one bad data entry in the measurements,
where we let $e = (0,0,0,0,0,1.7783,0,0,0,0,0,0)$. Suppose that the iterative sparse recovery
algorithm starts with the initial state $(x_0, y_0) = (0.1, 0.2)$. Then by definition, at the initial point
$(x_0, y_0) = (0.1, 0.2)$, the local Jacobian matrix is

\[ H^{local} = \begin{pmatrix}
0.5821 & 0.5821 \\
0.8667 & 0.8667 \\
-0.1993 & 0.1993 \\
0.9850 & -0.9850 \\
0.1987 & -0.3983 \\
0.2000 & -0.3911 \\
1.0250 & 0.9651 \\
0.9849 & -0.9258 \\
0.1993 & 0 \\
0.9850 & 0 \\
0 & 0.3947 \\
0 & 0.9403
\end{pmatrix}. \tag{V.26} \]



Then by using the mean value theorem in two variables for the twelve functions in $h(\cdot)$, we can
calculate an $H^{true}$ as

\[ H^{true} = \begin{pmatrix}
0.8707 & 0.8707 \\
0.6596 & 0.6596 \\
-0.3459 & 0.3459 \\
0.9515 & -0.9515 \\
0.2932 & -0.6407 \\
0.2984 & -0.6063 \\
1.0709 & 0.8968 \\
0.9683 & -0.8013 \\
0.2975 & 0 \\
0.9651 & 0 \\
0 & 0.6239 \\
0 & 0.8367
\end{pmatrix}. \tag{V.27} \]

For the small example with two variables, we can calculate $\sigma^1_{\max}(H^{true} - H^{local}) = 1.6590$,
$\sigma^1_{\min}(H^{local}) = 3.9284$, and $C = 13.5501$. So

\[ \frac{2(C+1)\sigma^1_{\max}(H^{true} - H^{local})}{(C-1)\sigma^1_{\min}(H^{local})} = 0.9792 < 1, \tag{V.28} \]

which satisfies the condition appearing in Theorem 5.4.
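The two matrix quantities in this condition can be recomputed from the rounded matrix entries. The sketch below is an illustration under stated assumptions: the induced $\ell_1$ norm is evaluated as the largest absolute column sum, $\sigma^1_{\min}$ is found by a simple grid scan over the $\ell_1$ unit sphere (adequate only because there are two variables), and $C = 13.5501$ is taken as given from the text rather than recomputed. The function names are hypothetical.

```python
# Local Jacobian at (0.1, 0.2), rows ordered as in (V.26).
H_local = [
    [0.5821, 0.5821], [0.8667, 0.8667], [-0.1993, 0.1993], [0.9850, -0.9850],
    [0.1987, -0.3983], [0.2000, -0.3911], [1.0250, 0.9651], [0.9849, -0.9258],
    [0.1993, 0.0], [0.9850, 0.0], [0.0, 0.3947], [0.0, 0.9403],
]
# Mean-value matrix between (0.1, 0.2) and (0.2, 0.45), rows ordered as in (V.27).
H_true = [
    [0.8707, 0.8707], [0.6596, 0.6596], [-0.3459, 0.3459], [0.9515, -0.9515],
    [0.2932, -0.6407], [0.2984, -0.6063], [1.0709, 0.8968], [0.9683, -0.8013],
    [0.2975, 0.0], [0.9651, 0.0], [0.0, 0.6239], [0.0, 0.8367],
]

def sigma1_max(A):
    """Induced l1 matrix norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def sigma1_min_2col(A, steps=2000):
    """min ||A z||_1 over ||z||_1 = 1 for a two-column A, by a grid scan.

    ||A z||_1 is even in z, so scanning z = (t, 1 - t) and z = (t, -(1 - t))
    for t in [0, 1] covers the l1 unit sphere up to sign.
    """
    best = float("inf")
    for i in range(steps + 1):
        t = i / steps
        for sign in (1.0, -1.0):
            z = (t, sign * (1.0 - t))
            best = min(best, sum(abs(r[0] * z[0] + r[1] * z[1]) for r in A))
    return best

C = 13.5501  # value reported in the text, taken as given here
diff = [[a - b for a, b in zip(rt, rl)] for rt, rl in zip(H_true, H_local)]
s_max = sigma1_max(diff)
s_min = sigma1_min_2col(H_local)
ratio = 2 * (C + 1) / (C - 1) * s_max / s_min
print(round(s_max, 4), round(s_min, 4), round(ratio, 4))
```

With the four-decimal entries above, the scan attains its minimum at $z = (0.5, 0.5)$, and the resulting ratio stays below one, in agreement with (V.28).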

In fact, in the first iteration, the algorithm outputs $(x_1, y_1) = (0.1980, 0.4392)$ and $\|(x_1, y_1) - (x, y)\|_2 =
0.0110$. After the second iteration, we already get a very good estimate $(x_2, y_2) = (0.2000, 0.4500)$
and \\{x2,y2) — {x,y)\\2 = 2.2549 x 10~^. The solution does converge to the true state. 
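The iterations of this noise-free example can be reproduced with a small script. This is a sketch under assumptions, not the implementation used for the reported numbers: because there is no additive noise, each step is taken to be a plain $\ell_1$ fit $\min_{\Delta x} \|\Delta y - H^{local}\Delta x\|_1$, and the fit is solved by brute force, using the fact that a two-variable $\ell_1$ regression with a full-rank matrix has an optimal solution that zeroes at least two residuals. The helper names (`jacobian`, `l1_fit_2d`) are hypothetical.

```python
import math
from itertools import combinations

def h(x, y):
    """The 12 measurement functions of the example, in index order 1..12."""
    s, d = x + y, x - y
    return [s * math.sin(s), s * math.cos(s), d * math.sin(d), d * math.cos(d),
            s * math.sin(d), d * math.sin(s), s * math.cos(d), d * math.cos(s),
            x * math.sin(x), x * math.cos(x), y * math.sin(y), y * math.cos(y)]

def jacobian(x, y):
    """Rows are [dh_i/dx, dh_i/dy] for the 12 entries of h."""
    s, d = x + y, x - y
    ss, cs, sd, cd = math.sin(s), math.cos(s), math.sin(d), math.cos(d)
    return [
        [ss + s * cs, ss + s * cs],
        [cs - s * ss, cs - s * ss],
        [sd + d * cd, -sd - d * cd],
        [cd - d * sd, -cd + d * sd],
        [sd + s * cd, sd - s * cd],
        [ss + d * cs, -ss + d * cs],
        [cd - s * sd, cd + s * sd],
        [cs - d * ss, -cs - d * ss],
        [math.sin(x) + x * math.cos(x), 0.0],
        [math.cos(x) - x * math.sin(x), 0.0],
        [0.0, math.sin(y) + y * math.cos(y)],
        [0.0, math.cos(y) - y * math.sin(y)],
    ]

def l1_fit_2d(H, dy):
    """Minimize ||dy - H u||_1 over u in R^2 by trying every pair of rows
    whose residuals are forced to zero and keeping the best candidate."""
    best_u, best_obj = None, float("inf")
    for i, j in combinations(range(len(H)), 2):
        (a, b), (c, d) = H[i], H[j]
        det = a * d - b * c
        if abs(det) < 1e-12:  # parallel rows give no unique candidate
            continue
        u = ((dy[i] * d - b * dy[j]) / det, (a * dy[j] - c * dy[i]) / det)
        obj = sum(abs(r - (row[0] * u[0] + row[1] * u[1])) for row, r in zip(H, dy))
        if obj < best_obj:
            best_u, best_obj = u, obj
    return best_u

true_state = (0.2, 0.45)
e = [0.0] * 12
e[5] = 1.7783  # one bad entry, at measurement index 6
y_meas = [hi + ei for hi, ei in zip(h(*true_state), e)]

state = (0.1, 0.2)  # initial state
iterates = []
for k in range(5):
    dy = [m - hi for m, hi in zip(y_meas, h(*state))]
    step = l1_fit_2d(jacobian(*state), dy)
    state = (state[0] + step[0], state[1] + step[1])
    iterates.append(state)
    err = math.hypot(state[0] - true_state[0], state[1] - true_state[1])
    print(k + 1, state, err)
```

The brute-force solver is only practical because the state has two variables; for larger problems a linear or convex programming solver would take its place.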

\[ \begin{pmatrix}
\sin(x+y) + (x+y)\cos(x+y) & \sin(x+y) + (x+y)\cos(x+y) \\
\cos(x+y) - (x+y)\sin(x+y) & \cos(x+y) - (x+y)\sin(x+y) \\
\sin(x-y) + (x-y)\cos(x-y) & -\sin(x-y) - (x-y)\cos(x-y) \\
\cos(x-y) - (x-y)\sin(x-y) & -\cos(x-y) + (x-y)\sin(x-y) \\
\sin(x-y) + (x+y)\cos(x-y) & \sin(x-y) - (x+y)\cos(x-y) \\
\sin(x+y) + (x-y)\cos(x+y) & -\sin(x+y) + (x-y)\cos(x+y) \\
\cos(x-y) - (x+y)\sin(x-y) & \cos(x-y) + (x+y)\sin(x-y) \\
\cos(x+y) - (x-y)\sin(x+y) & -\cos(x+y) - (x-y)\sin(x+y) \\
\sin(x) + x\cos(x) & 0 \\
\cos(x) - x\sin(x) & 0 \\
0 & \sin(y) + y\cos(y) \\
0 & \cos(y) - y\sin(y)
\end{pmatrix} \tag{V.6} \]

We note that the convergence condition in Theorem 5.4 is conservative. Sometimes even if the initial
starting point is far from the true state and the convergence condition fails, the algorithm can still
converge. For example, now suppose that the true state is at $(x, y) = (0.4, 1.2)$ and the iterative sparse
recovery algorithm still initializes with $(x_0, y_0) = (0.1, 0.2)$. Suppose that the bad data vector is still
$e = (0,0,0,0,0,1.7783,0,0,0,0,0,0)$. Then again by using the mean value theorem, we calculate an
$H^{true}$ as

\[ H^{true} = \begin{pmatrix}
1.1621 & 1.1621 \\
-0.2566 & -0.2566 \\
-0.8055 & 0.8055 \\
0.6543 & -0.6543 \\
0.4119 & -1.2413 \\
0.4068 & -0.8921 \\
1.3597 & 0.4083 \\
0.9437 & -0.1640 \\
0.4860 & 0 \\
0.8964 & 0 \\
0 & 1.0786 \\
0 & 0.2385
\end{pmatrix}. \tag{V.29} \]

In the first iteration, we get a new estimate of the state $(x_1, y_1) = (0.3730, 0.7558)$ and $\|(x_1, y_1) -
(x, y)\|_2 = 0.4450$. After the second iteration, we get a new estimate $(x_2, y_2) = (0.3995, 1.1468)$ and
$\|(x_2, y_2) - (x, y)\|_2 = 0.0532$. After the third iteration, we get a new estimate $(x_3, y_3) = (0.400, 1.2003)$
and $\|(x_3, y_3) - (x, y)\|_2 = 2.96 \times 10^{-4}$. The algorithm converges to the true state even though in the
first step, $\sigma^1_{\max}(H^{true} - H^{local}) = 6.6885$, $\sigma^1_{\min}(H^{local}) = 3.9284$, $C = 13.5501$ and

\[ \frac{2(C+1)\sigma^1_{\max}(H^{true} - H^{local})}{(C-1)\sigma^1_{\min}(H^{local})} = 3.9478 > 1. \tag{V.30} \]
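As a worked check of the arithmetic, using only the three values quoted above, the left-hand side factors into a term that depends on $C$ alone and the ratio of the two matrix quantities:

```latex
\[
\frac{2(C+1)}{C-1} \cdot \frac{\sigma^1_{\max}(H^{true} - H^{local})}{\sigma^1_{\min}(H^{local})}
= \frac{2 \times 14.5501}{12.5501} \times \frac{6.6885}{3.9284}
\approx 2.3187 \times 1.7026
\approx 3.948 .
\]
```

Only $\sigma^1_{\max}(H^{true} - H^{local})$ changed relative to the first example (from 1.6590 to 6.6885), which is what pushes the ratio above one: the starting point is farther from the true state, so the local Jacobian is a poorer stand-in for $H^{true}$.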
VI. Numerical Results

In our simulation, we apply (II.6) to estimate an unknown vector from Gaussian linear measurements
with both sparse errors and noise, and also apply the iterative method to recover state information from
nonlinear measurements with bad data and noise in a power system.

Linear System: We first consider recovering a signal vector from linear Gaussian measurements. Let
$m = 60$ and $n = 150$. We generate the measurement matrix $H^{n \times m}$ with i.i.d. $\mathcal{N}(0, 1)$ entries. We also
generate a vector $x \in R^m$ with i.i.d. Gaussian entries and normalize it to $\|x\|_2 = 1$.

We fix the noise level and consider the estimation performance when the number of erroneous measurements
changes. We add to each measurement of $Hx$ a Gaussian noise independently drawn
from $\mathcal{N}(0, 0.5^2)$. Let $p$ denote the percentage of erroneous measurements. Given $p$, we randomly choose
$pn$ measurements, and to each such measurement we add a Gaussian error independently drawn from
$\mathcal{N}(0, 5^2)$. We apply (II.6) to estimate $x$ using different choices of $\lambda$. Let $x^*$ denote the estimate of
$x$; the estimation error is measured by $\|x^* - x\|_2$. We use (II.6) instead of (II.1) in simulation,
since the recovery algorithm has no prior knowledge of the noise vector, and solving an unconstrained
optimization problem is more computationally efficient than solving a constrained one.
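The data-generation step of this experiment can be sketched as follows. This covers only the construction of $H$, $x$ and the corrupted measurements, not the estimation step, and it is an illustration rather than the simulation code behind the figures: the function name, the seed handling, and the use of `round` to turn $pn$ into an integer count are assumptions made here.

```python
import random

def simulate_measurements(n=150, m=60, p=0.1, noise_std=0.5, error_std=5.0, seed=0):
    """Generate y = H x + v + e with dense noise v and sparse errors e.

    n is the number of measurements, m the signal dimension, and p the
    fraction of measurements that additionally receive a large error.
    """
    rng = random.Random(seed)
    # n x m measurement matrix with i.i.d. N(0, 1) entries.
    H = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    # Gaussian signal, normalized to unit l2 norm.
    x = [rng.gauss(0.0, 1.0) for _ in range(m)]
    norm = sum(v * v for v in x) ** 0.5
    x = [v / norm for v in x]
    # Indices of the erroneous measurements, chosen uniformly at random.
    bad = set(rng.sample(range(n), round(p * n)))
    y = []
    for i, row in enumerate(H):
        value = sum(a * b for a, b in zip(row, x)) + rng.gauss(0.0, noise_std)
        if i in bad:
            value += rng.gauss(0.0, error_std)
        y.append(value)
    return H, x, y, bad

H, x, y, bad = simulate_measurements(p=0.1)
print(len(H), len(H[0]), len(y), len(bad))
```

Passing the returned `H` and `y` to a convex solver for the mixed $\ell_1$ and $\ell_2$ problem, and comparing the result with `x`, would complete one run of the experiment.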

Fig. 3 shows how the estimation error changes as $p$ increases, where each result is averaged over one
hundred and fifty runs. As discussed earlier, when $\lambda$ is large, like $\lambda = 18$ in this example, (II.6) approaches
$\ell_1$-minimization; when $\lambda$ is close to zero, like $\lambda = 0.05$ here, (II.6) approaches $\ell_2$-minimization; when
$\lambda = 8$, (II.6) can be viewed as a weighted version of $\ell_1$ and $\ell_2$ minimization. When $p$ is zero or close
to one, the measurements only contain i.i.d. Gaussian noise; thus, among the three choices of $\lambda$, the
estimation error is relatively small when $\lambda = 0.05$. When $p$ is away from zero and one, the measurements
contain both noise and sparse errors, and a weighted version of $\ell_1$ and $\ell_2$ minimization (represented by the
case $\lambda = 8$) outperforms both $\ell_1$-minimization (approximated by the case $\lambda = 18$) and $\ell_2$-minimization
(approximated by the case $\lambda = 0.05$) in terms of a small estimation error.

We next consider the recovery performance when the number of erroneous measurements is fixed.
We randomly choose twelve measurements and add to each such measurement an independent Gaussian
error from $\mathcal{N}(0, 5^2)$. Then, we add an independent Gaussian noise from $\mathcal{N}(0, \sigma^2)$ to each one of the $n$
measurements. Fig. 4 shows how the estimation error $\|x^* - x\|_2$ changes as $\sigma$ increases with different
choices of $\lambda$. When $\sigma$ is close to zero, the effect of the sparse errors dominates, thus $\ell_1$-minimization
(approximated by the case $\lambda = 18$) has the best recovery performance. When $\sigma$ is large, the effect of
the i.i.d. Gaussian noise dominates, thus $\ell_2$-minimization (approximated by the case $\lambda = 0.05$) has the
best recovery performance. In between, a weighted version of $\ell_1$ and $\ell_2$ minimization (represented by
the case $\lambda = 8$) has the best performance.

Fig. 3: Estimation error versus $p$ for Gaussian measurements with fixed noise level

Fig. 4: Estimation error versus $\sigma$ for Gaussian measurements with fixed percentage of errors

For a given $\sigma$, we also apply (II.6) with $\lambda$ from 0.05 to 12.05 (step size 0.2), and pick the best $\lambda^*$
with which the estimation error is minimized. For each $\sigma$, the result is averaged over three hundred runs.
Fig. 5 shows the curve of $\lambda^*$ against $\sigma$. When the percentage of measurements with bad data is fixed
($p = 12/100 = 0.12$ here), $\lambda^*$ decreases as the noise level increases.

Power System: We also consider estimating the state of a power system from available measurements 
and known system configuration. The state variables are the voltage magnitudes and the voltage angles at 
each bus. The measurements can be the real and reactive power injections at each bus, and the real and 
reactive power flows on the lines. All the measurements are corrupted with noise, and a small fraction 
of the measurements contains errors. We would like to estimate the state variables from the corrupted 
measurements. 

Fig. 5: $\lambda^*$ versus $\sigma$ for Gaussian measurements

The relationship between the measurements and the state variables for a $k'$-bus system can be stated
as follows mi: 

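\[ P_i = \sum_{j=1}^{k'} E_i E_j Y_{ij} \cos(\theta_{ij} + \delta_i - \delta_j), \tag{VI.1} \]

\[ Q_i = \sum_{j=1}^{k'} E_i E_j Y_{ij} \sin(\theta_{ij} + \delta_i - \delta_j), \tag{VI.2} \]

\[ P_{ij} = E_i E_j Y_{ij} \cos(\theta_{ij} + \delta_i - \delta_j) - E_i^2 Y_{ij} \cos\theta_{ij} + E_i^2 Y_{si} \cos\theta_{si}, \quad i \ne j, \tag{VI.3} \]

\[ Q_{ij} = E_i E_j Y_{ij} \sin(\theta_{ij} + \delta_i - \delta_j) - E_i^2 Y_{ij} \sin\theta_{ij} + E_i^2 Y_{si} \sin\theta_{si}, \quad i \ne j, \tag{VI.4} \]

where $P_i$ and $Q_i$ are the real and reactive power injection at bus $i$ respectively, $P_{ij}$ and $Q_{ij}$ are the real
and reactive power flow from bus $i$ to bus $j$, and $E_i$ and $\delta_i$ are the voltage magnitude and angle at bus $i$.
$Y_{ij}$ and $\theta_{ij}$ are the magnitude and phase angle of the admittance from bus $i$ to bus $j$, and $Y_{si}$ and $\theta_{si}$ are the
magnitude and angle of the shunt admittance of the line at bus $i$. Given a power system, all $Y_{ij}$, $\theta_{ij}$, $Y_{si}$ and
$\theta_{si}$ are known.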
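The four measurement equations translate directly into code. The sketch below is an illustration only: the function names, the zero-based bus indexing, and the two-bus admittance data are hypothetical and are not taken from the IEEE test system or from the paper.

```python
import math

def power_injection(i, E, delta, Y, theta):
    """Real and reactive injection (P_i, Q_i) at bus i, following (VI.1)-(VI.2)."""
    P = Q = 0.0
    for j in range(len(E)):
        angle = theta[i][j] + delta[i] - delta[j]
        P += E[i] * E[j] * Y[i][j] * math.cos(angle)
        Q += E[i] * E[j] * Y[i][j] * math.sin(angle)
    return P, Q

def power_flow(i, j, E, delta, Y, theta, Ys, theta_s):
    """Real and reactive flow (P_ij, Q_ij) from bus i to bus j, following (VI.3)-(VI.4)."""
    angle = theta[i][j] + delta[i] - delta[j]
    e_sq = E[i] ** 2
    P = (E[i] * E[j] * Y[i][j] * math.cos(angle)
         - e_sq * Y[i][j] * math.cos(theta[i][j])
         + e_sq * Ys[i] * math.cos(theta_s[i]))
    Q = (E[i] * E[j] * Y[i][j] * math.sin(angle)
         - e_sq * Y[i][j] * math.sin(theta[i][j])
         + e_sq * Ys[i] * math.sin(theta_s[i]))
    return P, Q

# Hypothetical two-bus data (not from the paper), with zero shunt admittance.
E = [1.0, 1.0]
Y = [[5.0, 5.0], [5.0, 5.0]]
theta = [[-1.4, 1.7], [1.7, -1.4]]
Ys = [0.0, 0.0]
theta_s = [0.0, 0.0]

injection = power_injection(0, E, [0.0, 0.0], Y, theta)
flat = power_flow(0, 1, E, [0.0, 0.0], Y, theta, Ys, theta_s)    # equal voltages
tilted = power_flow(0, 1, E, [0.1, 0.0], Y, theta, Ys, theta_s)  # angle difference
print(injection, flat, tilted)
```

With equal voltage magnitudes, equal angles and no shunt admittance, the first two terms of (VI.3) and (VI.4) cancel and the line flow is zero; introducing an angle difference between the two buses produces a nonzero flow.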

For a $k'$-bus system, we treat one bus as the reference bus and set the voltage angle at the reference
bus to be zero. There are $m = 2k' - 1$ state variables, with the first $k'$ variables for the bus voltage
magnitudes $E_i$ and the remaining $k' - 1$ variables for the bus voltage angles $\delta_i$. Let $x \in R^m$ denote the state
variables and let $y \in R^n$ denote the $n$ measurements of the real and reactive power injection and power
flow. Let $v \in R^n$ denote the noise and $e \in R^n$ denote the sparse error vector. Then we can write the
equations in a compact form,

\[ y = h(x) + v + e, \tag{VI.5} \]

where $h(\cdot)$ denotes the $n$ nonlinear functions defined in (VI.1) to (VI.4).

We use the iterative algorithm introduced in Subsection V-B to recover $x$ from $y$. We start with
the initial state $x^0$ where $x^0_i = 1$ for all $i \in \{1, \ldots, k'\}$, and $x^0_i = 0$ for all $i \in \{k'+1, \ldots, 2k'-1\}$.
Since we assume no knowledge of the magnitude of $v$, and an unconstrained problem is generally more
computationally efficient than a constrained one, in the $k$-th iteration, instead of solving (V.4), we solve
the following unconstrained convex optimization problem

\[ \min_{\Delta x, z} \|\Delta y^k - H^{local}\Delta x - z\|_1 + \lambda\|z\|_2, \tag{VI.6} \]

where $H^{local}$ is the Jacobian matrix of $h$ evaluated at $x^{k-1}$. Let $\Delta x^k$ denote the optimal solution of $\Delta x$
to (VI.6), then the state estimation is updated by

\[ x^k = x^{k-1} + \Delta x^k. \]

We repeat the process until $\Delta x^k$ is close to 0, or the number of iterations reaches a specified value.

We evaluate the performance on the IEEE 30-bus test system. Fig. 6 shows the structure of the test
system. Then the state vector contains 59 variables. We take $n = 100$ measurements including the
real and reactive power injection at each bus and some of the real and reactive power flows on the
lines, and evaluate how the estimation performance changes as the noise level increases when the number of
erroneous measurements is fixed. $p$ is fixed to be 0.06, and we randomly choose a set $T$ with cardinality
$|T| = pn = 6$. Each measurement with its index in $T$ contains a Gaussian error independently drawn
from $\mathcal{N}(0, 0.7^2)$. Each measurement also contains a Gaussian noise independently drawn from $\mathcal{N}(0, \sigma^2)$.
For a fixed noise level $\sigma$, we use the above-mentioned iterative procedure to recover the state vector
$x$. The result is averaged over two hundred runs. Fig. 7 shows the estimation error $\|x^* - x\|_2$ against
$\sigma$ when $p = 0.06$. Between $\lambda = 12$ (approximating $\ell_1$-minimization) and $\lambda = 0.5$ (approximating
$\ell_2$-minimization), the former has a better recovery performance when the noise level $\sigma$ is small, and the
latter has a better performance when $\sigma$ is large. Moreover, the recovery performance when $\lambda = 5$ in
general outperforms that when $\lambda$ is either large ($\lambda = 12$) or small ($\lambda = 0.05$).

Fig. 7: Estimation error versus $\sigma$ with fixed percentage of errors in power system

VII. Conclusion 

In this paper, we studied sparse recovery from nonlinear measurements, with applications in state
estimation for power networks from nonlinear measurements corrupted with bad data. An iterative mixed
$\ell_1$ and $\ell_2$ convex programming approach was proposed for state estimation by locally linearizing the nonlinear
measurements. By studying the almost Euclidean property for a linear subspace, we gave a new state
estimation error bound when the measurements are linear and are corrupted with both
bad data and additive noise. When the measurements are nonlinear and corrupted with bad data, we
gave conditions under which the solution of the iterative algorithm converges to the true state even though
the local linearization of the measurements may not be accurate. We numerically evaluated the performance of
the iterative convex programming approach in bad data detection for nonlinear electrical power networks. As a
byproduct, we provided sharp bounds on the almost Euclidean property of a linear subspace, using the
"escape-through-a-mesh" theorem from geometric functional analysis.